Atomization in Grammar Sharing

Author

  • Megumi Kameyama
Abstract

We describe a prototype SHARED GRAMMAR for the syntax of simple nominal expressions in Arabic, English, French, German, and Japanese implemented at MCC. In this grammar, a complex inheritance lattice of shared grammatical templates provides parts that each language can put together to form language-specific grammatical templates. We conclude that grammar sharing is not only possible but also desirable. It forces us to reveal crosslinguistically invariant grammatical primitives that may otherwise remain conflated with other primitives if we deal only with a single language or language type. We call this the process of GRAMMATICAL ATOMIZATION. The specific implementation reported here uses categorial unification grammar. The topics include the mono-level nominal category N, the functional distinction between ARGUMENT and NON-ARGUMENT of nominals, grammatical agreement, and word order types.

Is grammar sharing possible?

The multilingual project of MCC attempts to build a grammatical system hierarchically shared by multiple languages (Slocum & Justus 1985). The system as proposed should have an advantage over a system with separate grammars for different languages: it should reduce the size of a multilingual rule base, and facilitate the addition of new languages. Before presenting evidence for such advantages, however, there is the basic question to be answered: Is grammar sharing at all possible? Although it is well known that languages possess similarities based on genetic, typological, or areal grounds, the question remains whether and how these similarities translate into computational techniques. In this paper, we will describe a prototype shared grammar for simple nominal expressions in Arabic, English, French, German, and Japanese.[1] We conclude that grammar sharing is not only possible but also desirable. It forces us to reveal cross-linguistically invariant grammatical primitives that may otherwise remain conflated with other primitives if we deal only with a single language or language type. We call this the process of GRAMMATICAL ATOMIZATION[2] forced by grammar sharing. Each language or language type is then characterized by particular combinations of such primitives, often providing new insights with which to account for certain linguistic problems.

[1] Preliminary investigations have also been made on Spanish, Russian, and Chinese.
[2] The verb atomize means "to separate or be separated into free atoms" (The Collins English Dictionary, 2nd edition, 1986).

Before we go into more detail, the following is our view of what general components and mechanisms constitute a shared grammatical system.

Basic mechanisms in a shared grammar

The process of building a shared grammar, in our view, requires (i) linguistic description of a set of languages in a common theoretical framework, (ii) a mechanism for EXTRACTING a common grammatical assertion from two or more assertions, and (iii) a mechanism for MERGING grammatical assertions. The linguistic description should define certain string-combination operations (defined on string TYPES) associated with information structures. Then what we do is identify sharable packages of common string-types and information structures among independently motivated language-specific grammatical assertions. These packages are then put into the shared part of the grammar, and the remaining language-specifics are potential sources for more sharing. This extraction is essential in what we call ATOMIZATION, which is basically "breaking up of grammatical assertions into smaller independent parts" (i.e., decomposition). If we assume that all grammatical assertions are expressed in terms of FEATURE STRUCTURES (Shieber 1986), the atomization process would be defined around the notion of GENERALIZATION (i.e., the reverse of UNIFICATION) as follows:

Basic atomization:
Given two feature structures, Xa for category X in language A and Xb for category X in language B, the shared structure Xs for category X is the GENERALIZATION of Xa and Xb (i.e., the most specific feature structure in common with both Xa and Xb). Xs is separated out of either Xa or Xb, and placed into the shared space. Consequently, a partial ordering is established wherein Xs subsumes Xa and Xb, respectively.

There is an underlying assumption that two language-specific definitions of a common grammatical category share something in common no matter how small it is. This means that the linguistic descriptive basis is questionable if the content of Xs above is null. Conversely, if closely common information structures appear under language-specific definitions of distinct grammatical categories, we may suspect a basis for a new common grammatical category.

Once the shared and language-specific parts are separated out, a mechanism for merging them is necessary for successfully incorporating the shared assertion into the language-specific assertion. UNIFICATION by INHERITANCE is such a merging mechanism that we employ in our system (see below). The shared space is a complex inheritance lattice that provides various predefined grammatical assertions that can be freely merged to create language-specific ones.
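The extraction and merging steps can be illustrated with a minimal Python sketch. It is not the MCC implementation: it assumes feature structures are plain nested dictionaries without reentrancy (structure sharing), it treats the empty dictionary as an unconstrained value, and every feature name and value in it (`cat`, `agr`, `num`, `gender`, `det`, `classifier`) is a hypothetical toy example, not taken from the paper.

```python
from typing import Any


class UnificationFailure(Exception):
    """Raised when two feature structures carry clashing atomic values."""


def generalize(a: Any, b: Any) -> Any:
    """Most specific structure that subsumes both a and b (assumed dict model)."""
    if isinstance(a, dict) and isinstance(b, dict):
        # Keep only features present in both, generalizing their values.
        return {f: generalize(a[f], b[f]) for f in a.keys() & b.keys()}
    # Equal atoms are kept; anything else generalizes to "unconstrained".
    return a if a == b else {}


def unify(a: Any, b: Any) -> Any:
    """Merge two structures; {} acts as an unconstrained value."""
    if isinstance(a, dict) and isinstance(b, dict):
        out = dict(a)  # shallow copy so the inputs are not modified
        for feat, vb in b.items():
            out[feat] = unify(out[feat], vb) if feat in out else vb
        return out
    if a == {}:
        return b
    if b == {}:
        return a
    if a == b:
        return a
    raise UnificationFailure(f"{a!r} clashes with {b!r}")


def subsumes(general: Any, specific: Any) -> bool:
    """general subsumes specific if unifying them adds nothing to specific."""
    try:
        return unify(general, specific) == specific
    except UnificationFailure:
        return False


# Hypothetical definitions of category N in two languages A and B.
x_a = {"cat": "N", "agr": {"num": "sg", "gender": "fem"}, "det": "required"}
x_b = {"cat": "N", "agr": {"num": "sg"}, "classifier": "yes"}

# Extraction: the shared structure Xs is the generalization of Xa and Xb.
x_s = generalize(x_a, x_b)

# Merging: the language-specific remainder of A, unified with Xs, rebuilds Xa.
x_a_local = {"agr": {"gender": "fem"}, "det": "required"}
x_a_rebuilt = unify(x_s, x_a_local)
```

In this toy model, `subsumes(x_s, x_a)` and `subsumes(x_s, x_b)` both hold, which mirrors the partial ordering described above, and an empty `x_s` would signal the "null" case that the text treats as a warning about the descriptive basis. A fuller treatment would need reentrancy and typed values, which this sketch leaves out.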

Similar resources

Modeling of molecular atomization energies using machine learning

Atomization energies are an important measure of chemical stability. Machine learning is used to model atomization energies of a diverse set of organic molecules, based on nuclear charges and atomic positions only [1]. Our scheme maps the problem of solving the molecular time-independent Schrödinger equation onto a non-linear statistical regression problem. Kernel ridge regression [2] models ar...

MODELING OF RAPID SOLIDIFICATION PROCESS IN THE GAS ATOMIZATION OF MOLTEN METALS

In the present work, a model was proposed to predict the thermal history during rapid solidification (RS) of metal droplets in the gas atomization process. The classical theory of heterogeneous nucleation was based on Newtonian heat flow and enthalpy method. Solving the governing numerical equations by the finite difference method (FDM) gave up the opportunity of analyzing the temperature-time ...

Assessment of DFT Functionals in Predicting Bond Length and Atomization Energy of Catalytically Important Metal Dimers

In the present investigation, the results of extensive benchmarking study of density functional theory (DFT) methods on some catalytically important metal dimers have been reported. The calculations were carried out on Al2, Ti2, V2, Cr2, Mn2, Fe2, Co2, Ni2, Cu2, and Zn2 using DFT functionals such as GGA, meta GGA, hybrid meta GGA along with recently developed Minnesota functionals. The bond len...

Towards an Optimal Gradient-dependent Energy Functional of the PZ-SIC Form

Results of Perdew–Zunger self-interaction corrected (PZ-SIC) density functional theory calculations of the atomization energy of 35 molecules are compared to those of high-level quantum chemistry calculations. While the PBE functional, which is commonly used in calculations of condensed matter, is known to predict on average too high atomization energy (overbinding of the molecules), the applic...

DMMP Sensing Performance of Undoped and Al Doped Nanocrystalline ZnO Thin Films Prepared by Ultrasonic Atomization and Pyrolysis Method

Highly textured undoped (pure) and Al doped ZnO nanocrystalline thin films prepared by ultrasonic atomization and pyrolysis method are reported in this paper. ZnCl2 water solution was converted into fine mist by ultrasonic atomizer (Gapusol 9001 RBI Meylan, France). The mist was pyrolyzed on the glass substrates in horizontal quartz reactor placed in furnace. The Structural and microstructural ...


Publication date: 1988